Model Selection in Contextual Stochastic Bandit Problems
We study bandit model selection in stochastic environments. Our approach relies on a master algorithm that selects between candidate base algorithms. We develop a master-base algorithm abstraction that can work with general classes of base algorithms and different types of adversarial master algorithms. Our methods rely on a novel and generic smoothing transformation for bandit algorithms that permits us to obtain optimal $O(\sqrt{T})$ model selection guarantees for stochastic contextual bandit problems as long as the optimal base algorithm satisfies a high-probability regret guarantee. We show through a lower bound that even when one of the base algorithms has $O(\log T)$ regret, it is in general impossible to get better than $\Omega(\sqrt{T})$ regret in model selection, even asymptotically.
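To make the master-base abstraction concrete, the following is a minimal sketch of the interaction loop: an adversarial master (here an EXP3-style exponential-weights learner, standing in for Corral-type masters) keeps a distribution over the base algorithms, samples one per round, plays its proposed action, and updates itself with an importance-weighted reward while the sampled base receives the raw observation. The class name MasterBaseLoop, the base-algorithm interface (select_action, update), and the environment callback env_reward_fn are illustrative assumptions, not the paper's exact algorithm or notation.

import numpy as np

class MasterBaseLoop:
    """Schematic master-base model selection loop (illustrative sketch only)."""

    def __init__(self, bases, lr=0.1, explore=0.01):
        self.bases = bases            # list of base bandit algorithms (assumed interface)
        self.lr = lr                  # master learning rate
        self.explore = explore        # uniform exploration floor
        self.log_w = np.zeros(len(bases))

    def _probs(self):
        # exponential weights over base algorithms, mixed with uniform exploration
        w = np.exp(self.log_w - self.log_w.max())
        p = w / w.sum()
        k = len(self.bases)
        return (1 - self.explore) * p + self.explore / k

    def play_round(self, context, env_reward_fn):
        p = self._probs()
        i = np.random.choice(len(self.bases), p=p)
        action = self.bases[i].select_action(context)
        reward = env_reward_fn(action)
        # importance-weighted reward drives the master's EXP3-style update
        self.log_w[i] += self.lr * reward / p[i]
        # the sampled base algorithm is updated with the raw observation
        self.bases[i].update(context, action, reward)
        return action, reward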
Supplement to "Model Selection in Contextual Stochastic Bandit Problems"
In Section D we present the proofs for Section 5.1, and in Section H we present the proofs of the lower bounds of Section 6. We also briefly outline some other direct applications of our results; for example, CORRAL achieves regret $O(\sqrt{|L|\, d\, T})$. In Appendix B.1 we reproduce the original Corral algorithm [2], and we also reproduce the EXP3.P algorithm (Figure 3.1 in [ ]). The smoothed base algorithm's expected replay regret bound implies that the total regret is bounded by $6\, U(T, \delta) \log(T)$. In Section D.2 (Applications of Proposition 5.1) we show that several algorithms are $(U, \delta, T)$-bounded (Lemma D.2).
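The smoothing transformation referenced above can be pictured as a wrapper around a base algorithm: at round t the wrapper replays a policy drawn uniformly from the policies the base proposed in rounds 1 through t, which is the mechanism that lets a high-probability $(U, \delta, T)$ regret bound for the base be transferred through an adversarial master. The sketch below is a simplified rendering under an assumed base interface (propose_policy, update); it is not the paper's exact pseudocode.

import random

class SmoothedBase:
    """Illustrative smoothing wrapper for a base bandit algorithm (sketch only)."""

    def __init__(self, base):
        self.base = base
        self.policies = []   # history of per-round policies (assumed to be callables)

    def select_action(self, context):
        # the base proposes its current policy for this round
        current_policy = self.base.propose_policy()
        self.policies.append(current_policy)
        # replay a policy sampled uniformly from rounds 1..t (the smoothing step)
        replayed = random.choice(self.policies)
        return replayed(context)

    def update(self, context, action, reward):
        # observed feedback is passed through to the underlying base algorithm
        self.base.update(context, action, reward)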
Review for NeurIPS paper: Model Selection in Contextual Stochastic Bandit Problems
Weaknesses: I believe the paper could present more context around the results in Sections 4.2 through 4.4. What other results exist in these settings? Similarly, the numerical experiments are not discussed at all and are hard to interpret; I find it difficult to draw conclusions from them. I would recommend moving the description of the algorithm ahead of Section 4, since Section 4 is hard to judge on a first read when the algorithm generating the claimed results has not yet been presented.